Tudor Dumitras, University of Maryland, College Park
Companies facing rampant attacks and data breaches have started turning to artificial intelligence techniques, such as machine learning, for security tasks. A machine learning classifier automatically learns models of malicious activity from a set of known-benign and known-malicious observations, without requiring a precise description of that activity to be prepared in advance. However, the effectiveness of these techniques depends primarily on the feature engineering process, which is usually a manual task based on human knowledge and intuition. Can we automate this process? Can we build an intelligent system that not only learns from examples, but can also help us build other intelligent systems?
We developed a system, called FeatureSmith, that engineers features for malware detectors by synthesizing the knowledge described in thousands of research papers. As a demonstration, we trained a machine learning classifier to detect Android malware using automatically engineered features, and it achieved performance comparable to that of a state-of-the-art Android malware detector that uses manually engineered features. In addition, FeatureSmith can suggest informative features that are absent from the manually engineered set, and it can link the generated features to human-understandable concepts that describe malware behaviors.
Sign up to find out more about Enigma conferences: https://www.usenix.org/conference/enigma2017#signup
Watch all Enigma 2017 videos at: http://enigma.usenix.org/youtube